import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
import pandas as pd
import tensorflow.keras as keras
import numpy as np
import matplotlib.pyplot as plt
import tensorflow_addons as tfa
import PIL
import re
import time
from kaggle_datasets import KaggleDatasets
from IPython import display
from tensorflow.keras import layers
from tensorflow.keras import optimizers
from tensorflow.keras.utils import plot_model
from tensorflow.keras.initializers import RandomNormal
from tensorflow.keras.models import Sequential, Model, load_model
from tensorflow.keras.layers import Conv2D, Conv2DTranspose, Dense, Flatten, Reshape
from tensorflow.keras.layers import BatchNormalization, Dropout
from tensorflow.keras.layers import ReLU, LeakyReLU, Activation
from tensorflow.keras.optimizers import Adam
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    print('Device:', tpu.master())
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.TPUStrategy(tpu)
except Exception as e:
    print("can't initialize tpu, using default, exception: " + str(e))
    strategy = tf.distribute.get_strategy()
print('Number of replicas:', strategy.num_replicas_in_sync)
AUTOTUNE = tf.data.experimental.AUTOTUNE
from PIL import Image
import shutil
Device: grpc://10.0.0.2:8470 Number of replicas: 8
A generative adversarial network (GAN) is a generative model built on an adversarial framework. It is composed of two CNN models, a generator and a discriminator, with the goal of generating new realistic images from a set of training images. The two models act as adversaries: the generator learns to produce fake images that look like real ones (starting from random noise), while the discriminator learns to determine whether a sample image is real or fake. The two models are trained together in a zero-sum game, and over time the generator gets better at producing images that are very close to real ones while the discriminator gets better at telling them apart. The process reaches equilibrium when the discriminator can no longer distinguish real images from fakes.
Source: Google
In this project, we will build and train a Deep Convolutional Generative Adversarial Network (DCGAN) with Keras to generate Monet-style images.
DCGAN is a type of GAN that uses convolutional neural networks (CNNs) as the generator and discriminator. CNNs are specifically designed for image recognition tasks and are well-suited for generating images with GANs. DCGAN includes several architectural changes compared to a regular GAN. It uses transposed convolutional layers in the generator instead of fully connected layers, and replaces pooling layers with strided convolutions. It also uses batch normalization to stabilize the training process and prevent the generator from collapsing.
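To make the stride arithmetic concrete, here is a small plain-Python sketch of the 'same'-padding shape rules; the layer counts anticipate the models built later in this notebook:

```python
import math

def conv_out(size, stride):
    # strided convolution with 'same' padding: output = ceil(input / stride)
    return math.ceil(size / stride)

def conv_transpose_out(size, stride):
    # transposed convolution with 'same' padding: output = input * stride
    return size * stride

# Five stride-2 convolutions downsample a 256 x 256 image to 8 x 8
size = 256
for _ in range(5):
    size = conv_out(size, 2)
print(size)  # 8

# Four stride-2 transposed convolutions upsample 16 x 16 back to 256 x 256
size = 16
for _ in range(4):
    size = conv_transpose_out(size, 2)
print(size)  # 256
```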
Source: towardsdatascience.com
As we can see, the discriminator is just a convolutional classification model. In contrast, the generator is more complex, as it learns to convert latent inputs into an actual image with the help of transposed and regular convolutions. In summary, while both GANs and DCGANs are used for generating new data, a DCGAN specifically uses convolutional neural networks as the generator and discriminator, and includes several architectural changes to improve the stability and quality of the generated data.
There are 4 major steps in the training:
In this project, I use a dataset from Kaggle, downloaded from the following link:
https://www.kaggle.com/competitions/gan-getting-started/data
The dataset contains four directories: monet_tfrec, photo_tfrec, monet_jpg, and photo_jpg. The monet_tfrec and monet_jpg directories contain the same painting images, and the photo_tfrec and photo_jpg directories contain the same photos.
The monet directories contain Monet paintings. We will use these images to train our model.
The photo directories contain photos. We will add Monet-style to these images and submit our generated jpeg images as a zip file.
Files
monet_jpg - 300 Monet paintings sized 256x256 in JPEG format
monet_tfrec - 300 Monet paintings sized 256x256 in TFRecord format
photo_jpg - 7028 photos sized 256x256 in JPEG format
photo_tfrec - 7028 photos sized 256x256 in TFRecord format
Load in the data by following the Monet CycleGAN Tutorial.
# load in the files of the TFRecords
gcs_path = KaggleDatasets().get_gcs_path()
monet_file = tf.io.gfile.glob(str(gcs_path + '/monet_tfrec/*.tfrec'))
print('The number of Monet TFRecord Files:', len(monet_file))
photo_file = tf.io.gfile.glob(str(gcs_path + '/photo_tfrec/*.tfrec'))
print('The number of Photo TFRecord Files:', len(photo_file))
count_pattern = re.compile(r"-([0-9]*)\.")  # shard filenames encode the image count
num_monet_samples = np.sum([int(count_pattern.search(filename).group(1)) for filename in monet_file])
print(f'The number of Monet image files: {num_monet_samples}')
num_photo_samples = np.sum([int(count_pattern.search(filename).group(1)) for filename in photo_file])
print(f'The number of Photo image files: {num_photo_samples}')
The number of Monet TFRecord Files: 5 The number of Photo TFRecord Files: 20 The number of Monet image files: 300 The number of Photo image files: 7038
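The regex works because of the shard-naming convention `<name>-<count>.tfrec`, where `<count>` is the number of images in that shard. A quick illustration with hypothetical shard names (the real filenames on Kaggle may differ, but follow the same pattern):

```python
import re

# Hypothetical shard names following the "<name>-<count>.tfrec" convention
filenames = ["monet00-60.tfrec", "monet01-60.tfrec", "monet02-60.tfrec",
             "monet03-60.tfrec", "monet04-60.tfrec"]
pattern = re.compile(r"-([0-9]*)\.")
total = sum(int(pattern.search(f).group(1)) for f in filenames)
print(total)  # 300
```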
# return the image from the TFRecord
image_size = [256, 256]
def decode_image(image):
    image = tf.image.decode_jpeg(image, channels=3)
    image = (tf.cast(image, tf.float32) / 127.5) - 1
    image = tf.reshape(image, [*image_size, 3])
    return image
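The `(x / 127.5) - 1` step rescales pixels from [0, 255] to [-1, 1], which matches the tanh output range used by the generator later. A quick numpy check:

```python
import numpy as np

pixels = np.array([0.0, 127.5, 255.0])
scaled = pixels / 127.5 - 1
print(scaled)  # [-1.  0.  1.]

# The inverse used later for visualization: x * 0.5 + 0.5 maps back to [0, 1]
restored = scaled * 0.5 + 0.5
print(restored)  # [0.  0.5 1. ]
```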
def read_tfrecord(sample):
    tfrecord_format = {
        "image_name": tf.io.FixedLenFeature([], tf.string),
        "image": tf.io.FixedLenFeature([], tf.string),
        "target": tf.io.FixedLenFeature([], tf.string)
    }
    sample = tf.io.parse_single_example(sample, tfrecord_format)
    image = decode_image(sample['image'])
    return image
# define the function to extract the image from the files
def load_data(filenames, labeled=True, ordered=False):
    data = tf.data.TFRecordDataset(filenames)
    data = data.map(read_tfrecord, num_parallel_calls=AUTOTUNE)
    return data
# load in the datasets
monet_ds = load_data(monet_file, labeled=True).batch(32)
photo_ds = load_data(photo_file, labeled=True).batch(32)
# Create iterators
sample_monet = next(iter(monet_ds))
sample_photo = next(iter(photo_ds))
# view shape of the datasets
print(sample_monet.shape)
print(sample_photo.shape)
(32, 256, 256, 3) (32, 256, 256, 3)
# define visualization function to view image
def visualize_images(example):
    plt.figure(figsize=(10, 10))
    for i in range(25):
        ax = plt.subplot(5, 5, i + 1)
        plt.imshow(example[i] * 0.5 + 0.5)  # rescale from [-1, 1] to [0, 1]
        plt.axis("off")
# Visualize some first images from the monet dataset
visualize_images(sample_monet)
# Visualize some first images from the photo dataset
visualize_images(sample_photo)
DCGAN is one of the most used, powerful, and successful types of GAN architecture. It is implemented with ConvNets in place of a multi-layer perceptron: the ConvNets use convolutional strides, are built without max pooling, and the layers in this network are not fully connected.
The generator network takes random Gaussian noise and maps it into input images such that the discriminator cannot tell which images came from the dataset and which images came from the generator.
Let’s define our generator model architecture:
The generator uses tf.keras.layers.Conv2DTranspose (upsampling) layers to produce an image from a seed (random noise). We start with a Dense layer that takes this seed as input, then upsample several times until we reach the desired image size of 256x256x3. Notice the tf.keras.layers.LeakyReLU activation for each layer, except the output layer, which uses tanh.
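As a quick sanity check on these shapes (plain arithmetic, independent of TensorFlow):

```python
# Dense layer output that gets reshaped to a 16 x 16 x 512 feature map
n_nodes = 16 * 16 * 512
print(n_nodes)  # 131072

# Four stride-2 Conv2DTranspose layers double the spatial size each time
size = 16
for _ in range(4):
    size *= 2
print(size)  # 256
```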
# create a function to build the generator model
def create_generator():
    model = Sequential(name="Generator")
    # Hidden Layer 1: Start with 16 x 16 image
    n_nodes = 16 * 16 * 512  # number of nodes in the first hidden layer
    model.add(Dense(n_nodes, input_shape=(100,), name='Generator-Hidden-Layer-1'))
    model.add(Reshape((16, 16, 512), name='Generator-Hidden-Layer-Reshape-1'))
    # Hidden Layer 2: Upsample to 32 x 32
    model.add(Conv2DTranspose(filters=256, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator-Hidden-Layer-2'))
    model.add(LeakyReLU(alpha=0.2))
    # Hidden Layer 3: Upsample to 64 x 64
    model.add(Conv2DTranspose(filters=128, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator-Hidden-Layer-3'))
    model.add(LeakyReLU(alpha=0.2))
    # Hidden Layer 4: Upsample to 128 x 128
    model.add(Conv2DTranspose(filters=64, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator-Hidden-Layer-4'))
    model.add(LeakyReLU(alpha=0.2))
    # Hidden Layer 5: Upsample to 256 x 256
    model.add(Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator-Hidden-Layer-5'))
    model.add(LeakyReLU(alpha=0.2))
    # Output Layer: we use 3 filters because we have 3 channels for a color image.
    model.add(Conv2DTranspose(3, kernel_size=(3, 3), activation='tanh', strides=(1, 1), padding='same', name='Generator-Output-Layer'))
    return model
# Use the noise vector to create an image. The generator is still untrained here!
with strategy.scope():
    generator = create_generator()
# Show model summary
generator.summary()
Model: "Generator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
Generator-Hidden-Layer-1 (De (None, 131072)            13238272
_________________________________________________________________
Generator-Hidden-Layer-Resha (None, 16, 16, 512)       0
_________________________________________________________________
Generator-Hidden-Layer-2 (Co (None, 32, 32, 256)       1179904
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 32, 32, 256)       0
_________________________________________________________________
Generator-Hidden-Layer-3 (Co (None, 64, 64, 128)       295040
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 64, 64, 128)       0
_________________________________________________________________
Generator-Hidden-Layer-4 (Co (None, 128, 128, 64)      73792
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 128, 128, 64)      0
_________________________________________________________________
Generator-Hidden-Layer-5 (Co (None, 256, 256, 32)      18464
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 256, 256, 32)      0
_________________________________________________________________
Generator-Output-Layer (Conv (None, 256, 256, 3)       867
=================================================================
Total params: 14,806,339
Trainable params: 14,806,339
Non-trainable params: 0
_________________________________________________________________
# create tmp directory
!mkdir ../tmp
# plot model diagram
plot_model(generator, to_file="../tmp/gen_model.png", show_shapes=True, show_layer_names=True)
# create vector of random noise to pass through the generator to see what the output is without the network having been trained
noise = tf.random.normal([1, 100])
with strategy.scope():
    generated_image = generator(noise, training=False)
plt.imshow(generated_image[0, :, :, 0])
<matplotlib.image.AxesImage at 0x7fb36c7f5f50>
The discriminator will be trained to tell the difference between images that come from the dataset and images that come from the generator.
Let’s now define the model architecture:
# create a function to build the discriminator model
def create_discriminator():
    model = Sequential(name="Discriminator")  # Model
    # Hidden Layer 1
    model.add(Conv2D(filters=32, kernel_size=(3,3), strides=(2, 2), padding='same', input_shape=[256, 256, 3], name='Discriminator-Hidden-Layer-1'))
    model.add(LeakyReLU(alpha=0.2, name='Discriminator-Hidden-Layer-Activation-1'))
    # Hidden Layer 2
    model.add(Conv2D(filters=64, kernel_size=(3,3), strides=(2, 2), padding='same', name='Discriminator-Hidden-Layer-2'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2, name='Discriminator-Hidden-Layer-Activation-2'))
    # Hidden Layer 3
    model.add(Conv2D(filters=128, kernel_size=(3,3), strides=(2, 2), padding='same', name='Discriminator-Hidden-Layer-3'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2, name='Discriminator-Hidden-Layer-Activation-3'))
    # Hidden Layer 4
    model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(2, 2), padding='same', name='Discriminator-Hidden-Layer-4'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2, name='Discriminator-Hidden-Layer-Activation-4'))
    # Hidden Layer 5
    model.add(Conv2D(filters=512, kernel_size=(3,3), strides=(2, 2), padding='same', name='Discriminator-Hidden-Layer-5'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2, name='Discriminator-Hidden-Layer-Activation-5'))
    # Flatten and Output Layers
    model.add(Flatten(name='Discriminator-Flatten-Layer'))  # Flatten the shape
    model.add(Dropout(0.3, name='Discriminator-Flatten-Layer-Dropout'))  # Randomly drop some connections for better generalization
    model.add(Dense(1, activation='sigmoid', name='Discriminator-Output-Layer'))  # Output Layer
    return model
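A quick arithmetic check of the Flatten layer's size (pure Python, no TensorFlow needed): five stride-2 'same' convolutions reduce 256 to 8, and the final convolution has 512 filters.

```python
size = 256
for _ in range(5):      # five stride-2 'same'-padded convolutions
    size //= 2          # 256 -> 128 -> 64 -> 32 -> 16 -> 8
flat_features = size * size * 512
print(size, flat_features)  # 8 32768
```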
# Build the discriminator model. The discriminator is still untrained here!
with strategy.scope():
    discriminator = create_discriminator()
# Show model summary
discriminator.summary()
Model: "Discriminator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
Discriminator-Hidden-Layer-1 (None, 128, 128, 32)      896
_________________________________________________________________
Discriminator-Hidden-Layer-A (None, 128, 128, 32)      0
_________________________________________________________________
Discriminator-Hidden-Layer-2 (None, 64, 64, 64)        18496
_________________________________________________________________
batch_normalization (BatchNo (None, 64, 64, 64)        256
_________________________________________________________________
Discriminator-Hidden-Layer-A (None, 64, 64, 64)        0
_________________________________________________________________
Discriminator-Hidden-Layer-3 (None, 32, 32, 128)       73856
_________________________________________________________________
batch_normalization_1 (Batch (None, 32, 32, 128)       512
_________________________________________________________________
Discriminator-Hidden-Layer-A (None, 32, 32, 128)       0
_________________________________________________________________
Discriminator-Hidden-Layer-4 (None, 16, 16, 256)       295168
_________________________________________________________________
batch_normalization_2 (Batch (None, 16, 16, 256)       1024
_________________________________________________________________
Discriminator-Hidden-Layer-A (None, 16, 16, 256)       0
_________________________________________________________________
Discriminator-Hidden-Layer-5 (None, 8, 8, 512)         1180160
_________________________________________________________________
batch_normalization_3 (Batch (None, 8, 8, 512)         2048
_________________________________________________________________
Discriminator-Hidden-Layer-A (None, 8, 8, 512)         0
_________________________________________________________________
Discriminator-Flatten-Layer  (None, 32768)             0
_________________________________________________________________
Discriminator-Flatten-Layer- (None, 32768)             0
_________________________________________________________________
Discriminator-Output-Layer ( (None, 1)                 32769
=================================================================
Total params: 1,605,185
Trainable params: 1,603,265
Non-trainable params: 1,920
_________________________________________________________________
# plot model diagram
plot_model(discriminator, to_file="../tmp/disc_model.png", show_shapes=True, show_layer_names=True)
# Use the discriminator to classify the image above (1 for real and 0 for fake)
with strategy.scope():
    decision = discriminator(generated_image)
print(decision)
tf.Tensor([[0.5000076]], shape=(1, 1), dtype=float32)
From the result above, we can see that the output is almost exactly 0.5: the untrained discriminator is essentially guessing and cannot yet tell whether the generated image is real or fake.
# create loss functions for the generator and discriminator
# Note: the discriminator ends with a sigmoid, so its outputs are probabilities (from_logits=False)
with strategy.scope():
    def generator_loss(fake_output):
        cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=False, reduction=tf.keras.losses.Reduction.NONE)
        # the generator wants its fakes to be classified as real (label 1)
        return cross_entropy(tf.ones_like(fake_output), fake_output)

    def discriminator_loss(real_output, fake_output):
        cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=False, reduction=tf.keras.losses.Reduction.NONE)
        real_loss = cross_entropy(tf.ones_like(real_output), real_output)
        fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
        total_loss = real_loss + fake_loss
        return total_loss
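To see what these losses mean numerically, here is a plain-Python binary cross-entropy on probabilities, evaluated at the "coin flip" score of 0.5 that the untrained discriminator produced above:

```python
import math

def bce(target, prob):
    # binary cross-entropy on a probability (the discriminator ends with a sigmoid)
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

real_output = 0.5   # untrained discriminator's score for a real image
fake_output = 0.5   # untrained discriminator's score for a fake image

gen_loss = bce(1.0, fake_output)                           # generator wants fakes scored as real (1)
disc_loss = bce(1.0, real_output) + bce(0.0, fake_output)  # discriminator wants real -> 1, fake -> 0
print(round(gen_loss, 4), round(disc_loss, 4))  # 0.6931 1.3863
```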
# Create two separate optimizers for the generator and discriminator
with strategy.scope():
    generator_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
    discriminator_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
The training loop begins with the generator receiving random noise as input, which it uses to produce an image. The discriminator then classifies real images (drawn from the training set) and fake images (produced by the generator). The loss is calculated for each model, and the gradients are used to update the generator and discriminator.
Here, we create a DCGAN_model class which is comprised of:
+ train(): performs one training step for the generator and discriminator.
+ distributed_train(): runs the training step on all replicas and averages the per-replica losses.
+ generate_images(): generates images from noise using the generator.
+ generate_and_plot_images(): generates images from the generator and visualizes them.
+ train_loop(): a loop that alternates between training the generator and discriminator for a given number of epochs, printing the running time and mean loss every 200 epochs.
We use tf.function to compile the training step into a graph and improve the performance of the TensorFlow code.
# Set the hyperparameters to be used for training
EPOCHS = 1000
BATCH_SIZE = 32
noise_dim = 100
shape_dim = [256,256,3]
class DCGAN_model:
    def __init__(self, noise_dim, EPOCHS, BATCH_SIZE, generator, discriminator, dataset):
        self.noise_dim = noise_dim
        self.EPOCHS = EPOCHS
        self.BATCH_SIZE = BATCH_SIZE
        self.generator = generator
        self.discriminator = discriminator
        self.dataset = dataset

    @tf.function
    def train(self, images):
        # Create a random noise vector for this batch
        noise = tf.random.normal([images.shape[0], self.noise_dim])
        with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
            # generate images from the random noise vector
            generated_images = self.generator(noise, training=True)
            # use the discriminator to evaluate the real and fake images
            real_output = self.discriminator(images, training=True)
            fake_output = self.discriminator(generated_images, training=True)
            # compute the generator loss and discriminator loss
            gen_loss = generator_loss(fake_output)
            disc_loss = discriminator_loss(real_output, fake_output)
        # Compute gradients
        gradients_of_generator = gen_tape.gradient(gen_loss, self.generator.trainable_variables)
        gradients_of_discriminator = disc_tape.gradient(disc_loss, self.discriminator.trainable_variables)
        # Apply gradients with the optimizers
        generator_optimizer.apply_gradients(zip(gradients_of_generator, self.generator.trainable_variables))
        discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, self.discriminator.trainable_variables))
        return (gen_loss + disc_loss) * 0.5

    @tf.function
    def distributed_train(self, images):
        per_replica_losses = strategy.run(self.train, args=(images,))
        return strategy.reduce(tf.distribute.ReduceOp.MEAN, per_replica_losses, axis=None)

    def generate_images(self):
        noise = tf.random.normal([self.BATCH_SIZE, self.noise_dim])
        predictions = self.generator.predict(noise)
        return predictions

    def generate_and_plot_images(self):
        image = self.generate_images()
        gen_imgs = 0.5 * image + 0.5  # rescale from [-1, 1] to [0, 1]
        fig = plt.figure(figsize=(10, 10))
        for i in range(25):
            plt.subplot(5, 5, i + 1)
            plt.imshow(gen_imgs[i, :, :, :])
            plt.axis('off')
        plt.show()

    def train_loop(self):
        e_ls = []
        mean_ls = []
        for epoch in range(self.EPOCHS):
            start = time.time()
            total_loss = 0.0
            num_batches = 0
            for image_batch in self.dataset:
                loss = self.distributed_train(image_batch)
                total_loss += tf.reduce_mean(loss)
                num_batches += 1
            mean_loss = total_loss / num_batches
            if (epoch + 1) % 200 == 0:
                print('Time for epoch {} is {} sec, mean loss is {}'.format(epoch + 1, time.time() - start, mean_loss))
                self.generate_and_plot_images()
                e_ls.append(epoch + 1)
                mean_ls.append(mean_loss)
        print("\nMean Loss for every 200 epochs: \n")
        table = pd.DataFrame({"Epoch": e_ls, "Mean Loss": np.array(mean_ls)})
        return table
# train, visualize and print out the result for DCGAN model
gan1 = DCGAN_model(noise_dim, EPOCHS, BATCH_SIZE, generator, discriminator, monet_ds)
res1 = gan1.train_loop()
res1
Time for epoch 200 is 0.8098070621490479 sec, mean loss is 1.188346266746521
Time for epoch 400 is 0.8173971176147461 sec, mean loss is 1.8819442987442017
Time for epoch 600 is 0.8098881244659424 sec, mean loss is 2.7338266372680664
Time for epoch 800 is 0.8240690231323242 sec, mean loss is 2.6023452281951904
Time for epoch 1000 is 0.8090136051177979 sec, mean loss is 2.5838356018066406
Mean Loss for every 200 epochs:
|   | Epoch | Mean Loss |
|---|---|---|
| 0 | 200 | 1.188346 |
| 1 | 400 | 1.881944 |
| 2 | 600 | 2.733827 |
| 3 | 800 | 2.602345 |
| 4 | 1000 | 2.583836 |
To tune the DCGAN model, instead of using a LeakyReLU activation with alpha=0.2 as above, I will now create two new generators with LeakyReLU activations alpha=0.3 and alpha=0.4.
Architecture:
# create a function to build the generator model
def create_generator2():
    model = Sequential(name="Generator2")
    # Hidden Layer 1: Start with 16 x 16 image
    n_nodes = 16 * 16 * 512  # number of nodes in the first hidden layer
    model.add(Dense(n_nodes, input_shape=(100,), name='Generator2-Hidden-Layer-1'))
    model.add(Reshape((16, 16, 512), name='Generator2-Hidden-Layer-Reshape-1'))
    # Hidden Layer 2: Upsample to 32 x 32
    model.add(Conv2DTranspose(filters=256, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator2-Hidden-Layer-2'))
    model.add(LeakyReLU(alpha=0.3))
    # Hidden Layer 3: Upsample to 64 x 64
    model.add(Conv2DTranspose(filters=128, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator2-Hidden-Layer-3'))
    model.add(LeakyReLU(alpha=0.3))
    # Hidden Layer 4: Upsample to 128 x 128
    model.add(Conv2DTranspose(filters=64, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator2-Hidden-Layer-4'))
    model.add(LeakyReLU(alpha=0.3))
    # Hidden Layer 5: Upsample to 256 x 256
    model.add(Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator2-Hidden-Layer-5'))
    model.add(LeakyReLU(alpha=0.3))
    # Output Layer: we use 3 filters because we have 3 channels for a color image.
    model.add(Conv2DTranspose(3, kernel_size=(3, 3), activation='tanh', strides=(1, 1), padding='same', name='Generator2-Output-Layer'))
    return model
# Build the second generator. It is still untrained here!
with strategy.scope():
    generator2 = create_generator2()
# train, visualize and print out the result for DCGAN model 2
gan2 = DCGAN_model(noise_dim, EPOCHS, BATCH_SIZE, generator2, discriminator, monet_ds)
res2 = gan2.train_loop()
res2
Time for epoch 200 is 0.8246738910675049 sec, mean loss is 3.549264907836914
Time for epoch 400 is 0.8136599063873291 sec, mean loss is 3.1389963626861572
Time for epoch 600 is 0.8165440559387207 sec, mean loss is 4.259570598602295
Time for epoch 800 is 0.8131566047668457 sec, mean loss is 3.706824541091919
Time for epoch 1000 is 0.8097562789916992 sec, mean loss is 3.2788479328155518
Mean Loss for every 200 epochs:
|   | Epoch | Mean Loss |
|---|---|---|
| 0 | 200 | 3.549265 |
| 1 | 400 | 3.138996 |
| 2 | 600 | 4.259571 |
| 3 | 800 | 3.706825 |
| 4 | 1000 | 3.278848 |
Architecture:
# create a function to build the generator model
def create_generator3():
    model = Sequential(name="Generator3")
    # Hidden Layer 1: Start with 16 x 16 image
    n_nodes = 16 * 16 * 512  # number of nodes in the first hidden layer
    model.add(Dense(n_nodes, input_shape=(100,), name='Generator3-Hidden-Layer-1'))
    model.add(Reshape((16, 16, 512), name='Generator3-Hidden-Layer-Reshape-1'))
    # Hidden Layer 2: Upsample to 32 x 32
    model.add(Conv2DTranspose(filters=256, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator3-Hidden-Layer-2'))
    model.add(LeakyReLU(alpha=0.4))
    # Hidden Layer 3: Upsample to 64 x 64
    model.add(Conv2DTranspose(filters=128, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator3-Hidden-Layer-3'))
    model.add(LeakyReLU(alpha=0.4))
    # Hidden Layer 4: Upsample to 128 x 128
    model.add(Conv2DTranspose(filters=64, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator3-Hidden-Layer-4'))
    model.add(LeakyReLU(alpha=0.4))
    # Hidden Layer 5: Upsample to 256 x 256
    model.add(Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=(2, 2), padding='same', name='Generator3-Hidden-Layer-5'))
    model.add(LeakyReLU(alpha=0.4))
    # Output Layer: we use 3 filters because we have 3 channels for a color image.
    model.add(Conv2DTranspose(3, kernel_size=(3, 3), activation='tanh', strides=(1, 1), padding='same', name='Generator3-Output-Layer'))
    return model
# Build the third generator. It is still untrained here!
with strategy.scope():
    generator3 = create_generator3()
# train, visualize and print out the result for DCGAN model 3
gan3 = DCGAN_model(noise_dim, EPOCHS, BATCH_SIZE, generator3, discriminator, monet_ds)
res3 = gan3.train_loop()
res3
Time for epoch 200 is 0.8352234363555908 sec, mean loss is 6.185884475708008
Time for epoch 400 is 0.8261730670928955 sec, mean loss is 3.475982666015625
Time for epoch 600 is 0.8074793815612793 sec, mean loss is 4.628479480743408
Time for epoch 800 is 0.8118836879730225 sec, mean loss is 6.122518062591553
Time for epoch 1000 is 0.8132047653198242 sec, mean loss is 5.27023983001709
Mean Loss for every 200 epochs:
|   | Epoch | Mean Loss |
|---|---|---|
| 0 | 200 | 6.185884 |
| 1 | 400 | 3.475983 |
| 2 | 600 | 4.628479 |
| 3 | 800 | 6.122518 |
| 4 | 1000 | 5.270240 |
# create compare table
table = {"Model": ["DCGAN 1", "DCGAN 2", "DCGAN 3"],
"Mean_Loss": [res1["Mean Loss"].mean(), res2["Mean Loss"].mean(), res3["Mean Loss"].mean()]}
compare_table = pd.DataFrame(table)
print("\nComparing three DCGAN models: \n")
compare_table
Comparing three DCGAN models:
|   | Model | Mean_Loss |
|---|---|---|
| 0 | DCGAN 1 | 2.198060 |
| 1 | DCGAN 2 | 3.586701 |
| 2 | DCGAN 3 | 5.136621 |
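The comparison means can be reproduced directly from the per-200-epoch losses reported above:

```python
dcgan1 = [1.188346, 1.881944, 2.733827, 2.602345, 2.583836]
dcgan2 = [3.549265, 3.138996, 4.259571, 3.706825, 3.278848]
dcgan3 = [6.185884, 3.475983, 4.628479, 6.122518, 5.270240]

# mean of the five recorded losses for each model
for name, losses in [("DCGAN 1", dcgan1), ("DCGAN 2", dcgan2), ("DCGAN 3", dcgan3)]:
    print(name, round(sum(losses) / len(losses), 6))
# DCGAN 1 2.19806
# DCGAN 2 3.586701
# DCGAN 3 5.136621
```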
Looking at the results above, we can conclude that in this case the DCGAN 1 model, whose generator uses a LeakyReLU activation with alpha=0.2, has the best performance with the lowest mean loss. Thus, I will use DCGAN 1 to generate 7000 Monet-style images.
# Create new directory
!mkdir ../images
# generate 7000 images
start = time.time()
with strategy.scope():
    for i in range(7000):
        noise = tf.random.normal([BATCH_SIZE, noise_dim])
        img = generator.predict(noise)
        img = 0.5 * img + 0.5                    # rescale from [-1, 1] to [0, 1]
        img = (img * 255).astype('uint8')        # convert to 8-bit pixel values
        img = Image.fromarray(img[0, :, :, :])   # keep the first image of the batch
        img.save("../images/" + str(i) + ".jpg")
print('Total running time is {} sec'.format(time.time() - start))
Total running time is 3160.128660917282 sec
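The two rescaling steps in the generation loop map the generator's tanh output range [-1, 1] to 8-bit pixels; a quick numpy check (note that the uint8 cast truncates 127.5 down to 127):

```python
import numpy as np

fake = np.array([-1.0, 0.0, 1.0])    # tanh output range
img = 0.5 * fake + 0.5               # -> [0.0, 0.5, 1.0]
img = (img * 255).astype('uint8')    # -> [0, 127, 255]
print(img)
```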
# view some submission DCGAN_images
fig = plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(PIL.Image.open("../images/" + str(i) + ".jpg"))
    plt.axis('off')
plt.show()
# Create a zip file of the images
#!zip -q -r images.zip ../images
shutil.make_archive("/kaggle/working/images", 'zip', "/kaggle/images")
'/kaggle/working/images.zip'
The goal of this project is to generate 7000 Monet-style images using DCGAN models. There are 5 parts:
(1) Brief description of the problem and data
(2) Exploratory Data Analysis
(3) Building and training DCGAN model
(4) Result and Analysis
(5) Conclusion and Takeaways
DCGAN, or Deep Convolutional Generative Adversarial Network, is a type of generative model that learns to generate new images by training on a dataset of existing images. DCGANs have shown impressive results in generating realistic images of faces, animals, landscapes, and other objects. Training a DCGAN model requires significant computational resources and can take a long time, depending on the size of the dataset and the complexity of the model. It is also important to carefully tune the hyperparameters to achieve the best possible results. Possible directions include:
+ adding more layers and different types of layers, and observing the effect on the training time and the stability of training
+ changing the number of filters
+ adjusting the activation functions
+ adjusting the learning rate: a high learning rate can cause the model to overshoot the optimal weights, while a low learning rate can result in slow convergence
+ adding regularization techniques such as dropout, weight decay, or spectral normalization to reduce overfitting and improve the generalization performance of the DCGAN